Early Deletion of Fillers In Processing Conversational Speech

Authors

  • Matthew Lease
  • Mark Johnson
Abstract

This paper evaluates the benefit of deleting fillers (e.g. you know, like) early in parsing conversational speech. Readability studies have shown that disfluencies (fillers and speech repairs) may be deleted from transcripts without compromising meaning (Jones et al., 2003), and deleting repairs prior to parsing has been shown to improve its accuracy (Charniak and Johnson, 2001). We explore whether this strategy of early deletion is also beneficial with regard to fillers. Reported experiments measure the effect of early deletion under in-domain and out-of-domain parser training conditions using a state-of-the-art parser (Charniak, 2000). While early deletion is found to yield only modest benefit for in-domain parsing, significant improvement is achieved for out-of-domain adaptation. This suggests a potentially broader role for disfluency modeling in adapting text-based tools for processing conversational speech.
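As a rough illustration of what "early deletion" means in practice, the sketch below removes filler tokens from a transcript before it would be handed to a parser. It is not the authors' method: the `FILLERS` lexicon and the `delete_fillers` helper are hypothetical, and simple lexicon matching is assumed only for illustration. A real system would need a filler detector, since phrases such as "you know" and words such as "like" also occur as ordinary content.

```python
# Hypothetical filler lexicon (an assumption for illustration only).
# "like" is left out on purpose: it is too often a real verb or preposition.
FILLERS = [("you", "know"), ("i", "mean"), ("um",), ("uh",)]


def delete_fillers(tokens, fillers=FILLERS):
    """Return tokens with lexicon-matched fillers removed (longest match first)."""
    filler_set = set(fillers)
    max_len = max(len(f) for f in fillers)
    kept = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate window first so multiword fillers win.
        for n in range(max_len, 0, -1):
            window = tuple(t.lower() for t in tokens[i:i + n])
            if len(window) == n and window in filler_set:
                i += n  # skip the whole filler
                matched = True
                break
        if not matched:
            kept.append(tokens[i])
            i += 1
    return kept


if __name__ == "__main__":
    utterance = "um I you know want uh a flight".split()
    # The cleaned token list is what a downstream parser would receive.
    print(delete_fillers(utterance))
```

Because deletion happens before parsing, the parser itself needs no changes, which is what makes the strategy attractive for a parser trained on out-of-domain written text.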

Similar resources

Automatic Detection of Sentence Boundaries, Disfluencies, and Conversational Fillers in Spontaneous Speech

Should Agents Speak Like, um, Humans? The Use of Conversational Fillers by Virtual Agents

We describe the design and evaluation of an agent that uses the fillers um and uh in its speech. We describe an empirical study of human-human dialogue, analyzing gaze behavior during the production of fillers and use this data to develop a model of agent-based gaze behavior. We find that speakers are significantly more likely to gaze away from their dialogue partner while uttering fillers, esp...

An Improved Model for Recognizing Disfluencies in Conversational Speech

This paper presents a novel metadata extraction (MDE) system for automatically detecting edited words, fillers, and self-interruption points in conversational speech. Our edit word detection sub-system combines a Tree Adjoining Grammar (TAG) noisy channel model, a statistical syntactic language model, and a MaxEnt reranker. Hand-built, deterministic rules are used to detect fillers. Self-interr...

A Model of Conversation Processing Based on Micro Conversational Events

I present a theory of discourse interpretation based on the hypothesis that the common ground of a conversation contains a record not only of complete speech acts, but, more in general, of each action of uttering a contribution to the conversation: single words, word fragments, and fillers. I call the action of uttering a ‘minimal’ contribution a MICRO CONVERSATIONAL EVENT. This model can serve...

Ethnomethodology and Conversational Analysis

In a speech community, people utilize their communicative competence which they have acquired from their society as part of their distinctive sociolinguistic identity. They negotiate and share meanings, because they have commonsense knowledge about the world, and have universal practical reasoning. Their commonsense knowledge is embodied in their language. Thus, not only does social life depend...

Publication date: 2006